Less is more: Eliminating index terms from subordinate clauses
Authors
Abstract
We perform a linguistic analysis of documents during indexing for information retrieval. By eliminating index terms that occur only in subordinate clauses, we reduce index size by approximately 30% without adversely affecting precision or recall. These results hold for two corpora: a sample of the World Wide Web and an electronic encyclopedia.
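The pruning idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each document has already been split into clauses and labelled main or subordinate by some upstream parser (not shown), and that "occurs only in subordinate clauses" is judged per document. The names `tokenize`, `kept_terms`, and `build_index` are hypothetical.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN.findall(text.lower())


def kept_terms(clauses):
    """Return the terms of one document that survive pruning.

    `clauses` is a list of (text, is_subordinate) pairs. The labels are
    assumed to come from a parser that is outside this sketch.
    A term is kept if it occurs in at least one main clause, so terms
    that occur only in subordinate clauses are dropped.
    """
    terms = set()
    for text, is_subordinate in clauses:
        if not is_subordinate:
            terms.update(tokenize(text))
    return terms


def build_index(docs):
    """Build an inverted index (term -> set of document ids)."""
    index = defaultdict(set)
    for doc_id, clauses in docs.items():
        for term in kept_terms(clauses):
            index[term].add(doc_id)
    return dict(index)


# Toy data with hand-assigned clause labels (an assumption for illustration).
docs = {
    "d1": [("The engine failed", False),
           ("because the valve was stuck", True)],
    "d2": [("The valve controls the flow", False)],
}
index = build_index(docs)
```

In this toy data, `valve` is indexed only for `d2`, because in `d1` it appears solely in a subordinate clause, and `stuck` does not enter the index at all.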
Similar resources
Input-induced Variation in EFL Learners’ Oral Production in Terms of Complexity, Accuracy, and Fluency
Researchers have extensively studied phenomena that affect a second language learner’s oral production, but there is scant evidence about input-related factors. Accordingly, the present study sought to investigate how the input learners receive from different course materials causes variation in their oral production. To this end, the study included a micro-evaluation study of three course materia...
Neural correlates of Dutch Verb Second in speech production.
Dutch speakers with agrammatic Broca's aphasia are known to have problems with the production of finite verbs in main clauses. This performance pattern has been accounted for in terms of the specific syntactic complexity of the Dutch main clause structure, which requires an extra syntactic operation (Verb Second), relative to the basic Subject-Object-Verb order surfacing in Dutch subordinate cl...
Subordination at the Interface: the Quasi-Subordination Hypothesis
The subordination of clauses, and the recursion which results, lies at the heart of linguistic representation. Yet there are many aspects of subordination that are not fully understood. An enduring puzzle is that some clauses which appear to be structurally subordinate nevertheless pattern with main clauses in certain, but not all, respects. This paper posits that such clauses, which we term “q...
Learning Preference of Dependency between Japanese Subordinate Clauses and its Evaluation in Parsing
(Utsuro et al., 2000) proposed a statistical method for learning the dependency preference of Japanese subordinate clauses, in which the scope embedding preference of subordinate clauses is exploited as a useful information source for disambiguating dependencies between subordinate clauses. Following (Utsuro et al., 2000), this paper presents detailed results of evaluating the proposed method by comparing...
Study on the English Corresponding Unit of Chinese Clause
This paper annotates the English corresponding units of Chinese clauses in Chinese-English translation and statistically analyzes them. Firstly, based on Chinese clause segmentation, we segment English target text into corresponding units (clause) to get a Chinese-to-English clause-aligned parallel corpus. Then, we annotate the grammatical properties of the English corresponding clauses in the ...